25 research outputs found

    Who is the director of this movie? Automatic style recognition based on shot features

    We show how low-level formal features, such as shot duration (the length of a camera take) and shot scale (the distance between the camera and the subject), are distinctive of a director's style in art movies. Until now, such features were thought to lack sufficient variety to be distinctive of an author. However, our investigation of the full filmographies of six directors (Scorsese, Godard, Tarr, Fellini, Antonioni, and Bergman), a total of 120 movies analysed second by second, confirms that these shot-related features do not appear as random patterns in movies from the same director. For feature extraction we adopt methods based on both conventional and deep learning techniques. Our findings suggest that sequential feature patterns, i.e. how features evolve in time, are at least as important as the corresponding feature distributions. To the best of our knowledge, this is the first study dealing with the automatic attribution of movie authorship, which opens up interesting lines of cross-disciplinary research on the impact of style on viewers' aesthetic and emotional responses.
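
    As a rough illustration of the distinction drawn here between feature distributions and sequential patterns, the sketch below contrasts an order-insensitive histogram of shot-scale classes with an order-sensitive transition matrix. The per-second annotations are randomly generated stand-ins, not the authors' data, and the three-class scale coding is a simplifying assumption.

```python
import numpy as np

# Hypothetical per-second shot-scale annotations for one film:
# 0 = close-up, 1 = medium shot, 2 = long shot (stand-in data, not real annotations).
rng = np.random.default_rng(0)
scale_per_second = rng.integers(0, 3, size=3600)

# Order-insensitive summary: the distribution of shot scales.
distribution = np.bincount(scale_per_second, minlength=3) / scale_per_second.size

# Order-sensitive summary: a first-order transition matrix capturing
# how the feature evolves in time (the "sequential pattern").
transitions = np.zeros((3, 3))
for a, b in zip(scale_per_second[:-1], scale_per_second[1:]):
    transitions[a, b] += 1
transitions /= transitions.sum(axis=1, keepdims=True)

print("scale distribution:", distribution)
print("scale transitions:\n", transitions)
```

    Two films can share the same shot-scale distribution yet differ sharply in their transition matrices, which is why sequential statistics add discriminative power.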

    Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review

    Modern hyperspectral imaging systems produce huge datasets that potentially convey a great abundance of information; such a resource, however, poses many challenges for the analysis and interpretation of these data. Deep learning approaches certainly offer a great variety of opportunities for solving classical imaging tasks, and also for tackling new, stimulating problems in the spatial–spectral domain. This is fundamental in Remote Sensing, the driving sector where hyperspectral technology was born and has mostly developed, but it is perhaps even more true in the multitude of current and emerging application sectors that involve these imaging technologies. The present review develops along two fronts: on the one hand, it is aimed at domain professionals who want an up-to-date overview of how hyperspectral acquisition techniques can be combined with deep learning architectures to solve specific tasks in different application fields. On the other hand, we target machine learning and computer vision experts by giving them a picture of how deep learning technologies are applied to hyperspectral data from a multidisciplinary perspective. The presence of these two viewpoints and the inclusion of application fields other than Remote Sensing are the original contributions of this review, which also highlights some potentialities and critical issues related to the observed development trends.
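
    As a concrete, hedged example of the kind of spatial–spectral architecture such reviews survey, here is a minimal 3D CNN in PyTorch that classifies the centre pixel of a hyperspectral patch. The band count, kernel sizes, and class count are arbitrary illustrative choices, not taken from any specific paper.

```python
import torch
import torch.nn as nn

class SpectralSpatialCNN(nn.Module):
    """Toy 3D CNN treating the spectral axis as depth (illustrative only)."""
    def __init__(self, n_bands: int = 100, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            # 3D convolutions slide jointly over (bands, height, width)
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),   # pool along the spectral axis
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3), padding=(2, 1, 1)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),               # global pooling over all three axes
        )
        self.classifier = nn.Linear(16, n_classes)

    def forward(self, x):  # x: (batch, 1, bands, H, W)
        return self.classifier(self.features(x).flatten(1))

# One 9x9 spatial patch with 100 spectral bands, labelled at its centre pixel
patch = torch.randn(1, 1, 100, 9, 9)
logits = SpectralSpatialCNN()(patch)
print(logits.shape)  # torch.Size([1, 10])
```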

    Fighting the scanner effect in brain MRI segmentation with a progressive level-of-detail network trained on multi-site data

    Many clinical and research studies of the human brain require accurate structural MRI segmentation. While traditional atlas-based methods can be applied to volumes from any acquisition site, recent deep learning algorithms ensure high accuracy only when tested on data from the same sites exploited in training (i.e., internal data). Performance degradation experienced on external data (i.e., unseen volumes from unseen sites) is due to the inter-site variability in intensity distributions, and to unique artefacts caused by different MR scanner models and acquisition parameters. To mitigate this site-dependency, often referred to as the scanner effect, we propose LOD-Brain, a 3D convolutional neural network with progressive levels-of-detail (LOD), able to segment brain data from any site. Coarser network levels are responsible for learning a robust anatomical prior helpful in identifying brain structures and their locations, while finer levels refine the model to handle site-specific intensity distributions and anatomical variations. We ensure robustness across sites by training the model on an unprecedentedly rich dataset aggregating data from open repositories: almost 27,000 T1w volumes from around 160 acquisition sites, at 1.5-3T, from a population spanning 8 to 90 years of age. Extensive tests demonstrate that LOD-Brain produces state-of-the-art results, with no significant difference in performance between internal and external sites, and with robustness to challenging anatomical variations. Its portability paves the way for large-scale applications across different healthcare institutions, patient populations, and imaging technology manufacturers. Code, model, and demo are available on the project website.
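
    The coarse-to-fine idea can be sketched in a few lines. The toy two-level model below only illustrates a progressive level-of-detail design, not the actual LOD-Brain architecture; all channel counts and sizes are invented.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelLOD(nn.Module):
    """Illustrative two-level coarse-to-fine segmenter (not LOD-Brain itself)."""
    def __init__(self, n_classes: int = 4):
        super().__init__()
        # Coarse level: sees a downsampled volume, learns a rough anatomical prior
        self.coarse = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, n_classes, 3, padding=1),
        )
        # Fine level: refines the upsampled coarse prediction at full resolution
        self.fine = nn.Sequential(
            nn.Conv3d(1 + n_classes, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, n_classes, 3, padding=1),
        )

    def forward(self, vol):  # vol: (batch, 1, D, H, W)
        low = F.avg_pool3d(vol, 2)                       # downsample the input
        prior = self.coarse(low)                         # coarse segmentation
        prior_up = F.interpolate(prior, size=vol.shape[2:], mode="trilinear",
                                 align_corners=False)    # back to full resolution
        return self.fine(torch.cat([vol, prior_up], dim=1))

logits = TwoLevelLOD()(torch.randn(1, 1, 32, 32, 32))
print(logits.shape)  # torch.Size([1, 4, 32, 32, 32])
```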

    Transfer learning of deep neural network representations for fMRI decoding

    Background: Deep neural networks have revolutionised machine learning, with unparalleled performance in object classification. However, in brain imaging (e.g., fMRI), the direct application of Convolutional Neural Networks (CNNs) to decoding subject states or perception from imaging data seems impractical given the scarcity of available data. New method: In this work we propose a robust method to transfer information from deep learning (DL) features to brain fMRI data with the goal of decoding. By adopting Reduced Rank Regression with Ridge Regularisation, we establish a multivariate link between imaging data and the fully connected layer (fc7) of a CNN. We exploit the reconstructed fc7 features by performing an object image classification task on two datasets: one of the largest fMRI databases, acquired on different scanners from more than two hundred subjects watching different movie clips, and another with fMRI data acquired while subjects watched static images. Results: The fc7 features could be significantly reconstructed from the imaging data and led to significant decoding performance. Comparison with existing methods: The decoding based on reconstructed fc7 features outperformed the decoding based on imaging data alone. Conclusion: In this work we show how to improve fMRI-based decoding by exploiting the mapping between functional data and CNN features. The potential advantage of the proposed method is twofold: the extraction of stimulus representations by means of an automatic (unsupervised) procedure, and the embedding of high-dimensional neuroimaging data into a space designed for visual object discrimination, leading to a more manageable space from a dimensionality point of view.
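
    One common formulation of reduced-rank ridge regression first solves a ridge problem and then constrains the coefficient rank via an SVD of the fitted values. The sketch below uses random matrices as stand-ins for fMRI data and fc7 activations, with toy dimensions and an arbitrary regularisation strength; it illustrates the technique, not the authors' exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, rank, lam = 200, 50, 4096, 10, 1.0  # samples, voxels, fc7 dims (toy sizes)

X = rng.standard_normal((n, p))        # stand-in for fMRI responses
Y = rng.standard_normal((n, q))        # stand-in for CNN fc7 activations

# Ridge solution mapping brain data to fc7 space
B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

# Reduced-rank step: project onto the top singular directions of the fitted values
_, _, Vt = np.linalg.svd(X @ B_ridge, full_matrices=False)
V_r = Vt[:rank].T                      # (q, rank)
B_rr = B_ridge @ V_r @ V_r.T           # rank-constrained coefficients

fc7_hat = X @ B_rr                     # reconstructed fc7 features, fed to a decoder
print(fc7_hat.shape)                   # (200, 4096)
```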

    ECG waveform dataset for predicting defibrillation outcome in out-of-hospital cardiac arrested patients

    The provided database of 260 ECG signals was collected from patients with out-of-hospital cardiac arrest while they were treated by the emergency medical services. Each ECG signal contains a 9-second waveform showing ventricular fibrillation, followed by 1 min of post-shock waveform. Patients' ECGs are made available in multiple formats. All ECGs recorded during the prehospital treatment are provided as PDF files, after being anonymized, printed on paper, and scanned. For each ECG, the dataset also includes the whole digitized waveform (9 s pre- and 1 min post-shock) and numerous features in the temporal and frequency domains extracted from the 9 s episode immediately prior to the first defibrillation shock. Based on the shock outcome, each ECG file has been annotated by three expert cardiologists, using majority decision, as successful (56 cases), unsuccessful (195 cases), or indeterminable (9 cases). The code for preprocessing, for feature extraction, and for limiting the investigation to different temporal intervals before the shock is also provided. These data can be reused to design algorithms that predict shock outcome based on ventricular fibrillation analysis, with the goal of optimizing the defibrillation strategy (immediate defibrillation versus cardiopulmonary resuscitation and/or drug administration) to enhance resuscitation.
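
    As an example of the kind of frequency-domain feature such algorithms build on, the sketch below computes AMSA (amplitude spectrum area), a widely used predictor of defibrillation success, on a synthetic stand-in for the 9 s pre-shock episode. The 250 Hz sampling rate and the 2-48 Hz band are assumptions; the dataset's own feature-extraction code may differ.

```python
import numpy as np

fs = 250                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
ecg = rng.standard_normal(9 * fs)          # stand-in for the 9 s pre-shock VF episode

# Amplitude spectrum of the windowed pre-shock segment
spectrum = np.abs(np.fft.rfft(ecg * np.hanning(ecg.size)))
freqs = np.fft.rfftfreq(ecg.size, d=1 / fs)

# AMSA: sum of frequency-weighted amplitudes within the VF band
band = (freqs >= 2) & (freqs <= 48)
amsa = np.sum(freqs[band] * spectrum[band])
print(f"AMSA (arbitrary units): {amsa:.1f}")
```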

    Assessing Trustworthy AI in times of COVID-19. Deep Learning for predicting a multi-regional score conveying the degree of lung compromise in COVID-19 patients

    The paper's main contributions are twofold: to demonstrate how to apply the European Union's High-Level Expert Group (EU HLEG) guidelines for trustworthy AI in practice in the healthcare domain, and to investigate the research question of what "trustworthy AI" means at the time of the COVID-19 pandemic. To this end, we present the results of a post-hoc self-assessment evaluating the trustworthiness of an AI system for predicting a multi-regional score conveying the degree of lung compromise in COVID-19 patients, developed and verified during the pandemic by an interdisciplinary team with members from academia, public hospitals, and industry. The AI system aims to help radiologists estimate and communicate the severity of damage in a patient's lungs from chest X-rays. It has been experimentally deployed in the radiology department of the ASST Spedali Civili clinic in Brescia (Italy) since December 2020. The methodology we applied for our post-hoc assessment, called Z-Inspection®, uses socio-technical scenarios to identify ethical, technical, and domain-specific issues in the use of the AI system in the context of the pandemic.

    Cross-domain assessment of deep learning-based alignment solutions for real-time 3D reconstruction

    Interesting deep learning solutions have recently been proposed to address different tasks along the 3D view alignment pipeline. However, a direct comparison among these technologies is still lacking, and their (possibly combined) potential has yet to be extensively tested. This is especially true when the focus is on diversified data and/or specific application requirements, such as those emerging in real-time 3D object reconstruction scenarios. This work is a first contribution in this direction: we perform an independent and extended comparison of the main deep learning-driven 3D view alignment solutions. We consider two relevant data types: data coming from commodity 3D sensors targeting indoor reconstruction applications, and denser data coming from a handheld 3D optical scanner, typically used for small-scale object reconstruction. While for the first scenario we refer to existing datasets, for the second setup we work on a new benchmarking dataset, namely DenseMatch. We run performance tests and extended comparisons with different system configurations, including model refinements, and find solid evidence that the generalization performance of deep learning systems for 3D alignment is critically linked to data characteristics. Finally, we design and test the first integration of deep learning solutions into a baseline method for real-time 3D reconstruction, clearly demonstrating improved effectiveness in addressing and solving typical tracking and scan-interruption issues arising in these demanding scenarios.
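
    To make a "learned coarse alignment + classical refinement" pipeline concrete, here is a minimal sketch using Open3D's ICP on synthetic clouds. The learned stage is faked with an identity initialisation (a real system would supply the coarse transform from a deep model), and all sizes and thresholds are arbitrary; this is not the paper's evaluation code.

```python
import copy
import numpy as np
import open3d as o3d

# Synthetic "scan": random points in a unit cube (stand-in for real sensor data)
rng = np.random.default_rng(0)
source = o3d.geometry.PointCloud()
source.points = o3d.utility.Vector3dVector(rng.random((2000, 3)))

# Target is the same cloud under a small known rigid motion
T_true = np.eye(4)
T_true[:3, 3] = [0.05, 0.02, 0.0]
target = copy.deepcopy(source)
target.transform(T_true)

# A learned model would normally provide the coarse alignment;
# here that stage is faked with the identity matrix.
T_init = np.eye(4)

# Classical point-to-point ICP as the refinement step
result = o3d.pipelines.registration.registration_icp(
    source, target, 0.1, T_init,
    o3d.pipelines.registration.TransformationEstimationPointToPoint())
print(result.transformation)              # should approach T_true
print(result.fitness, result.inlier_rmse)
```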